string.h
string.h mainly defines string handling functions and memory manipulation functions.
String handling functions
The following string handling functions are covered in detail in the String chapter.
- strcpy(): copies a string.
- strncpy(): copies a string, with a length limit.
- strcat(): concatenates two strings.
- strncat(): concatenates two strings, with a length limit.
- strcmp(): compares two strings.
- strncmp(): compares two strings, with a length limit.
- strlen(): returns the number of bytes in a string, not counting the terminating null character.
strchr(), strrchr()
Both strchr() and strrchr() are used to find the specified character in a string. The difference is that strchr() starts at the beginning of the string and strrchr() starts at the end of the string; the extra r in the function name means reverse.
char* strchr(const char* str, int c);
char* strrchr(const char* str, int c);
They both take two arguments, the first being a pointer to the string and the second being the character to be found.
Once the character is found, they stop the lookup and return a pointer to the character. If it is not found, NULL is returned.
Here is an example.
char *str = "Hello, world!";
char *p;
p = strchr(str, ','); // p points to the comma position
p = strrchr(str, 'o'); // p points to the position of o inside world
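As a further illustration of the two search directions, here is a minimal sketch. The helper names count_char and file_extension are hypothetical and are not part of string.h; the first walks forward with repeated strchr() calls, the second uses strrchr() to find the last dot in a file name.

```c
#include <stddef.h>
#include <string.h>

/* Count how many times c occurs in s by calling strchr() repeatedly,
   resuming one character past each match. */
size_t count_char(const char *s, char c) {
    size_t count = 0;
    if (c == '\0') {
        return 0; /* strchr() would match the terminator; ignore it here */
    }
    for (const char *p = strchr(s, c); p != NULL; p = strchr(p + 1, c)) {
        count++;
    }
    return count;
}

/* Return the text after the last '.' in name, or NULL if there is no dot. */
const char *file_extension(const char *name) {
    const char *dot = strrchr(name, '.');
    return dot ? dot + 1 : NULL;
}
```

For instance, count_char("Hello, world!", 'o') gives 2, and file_extension("archive.tar.gz") points at gz because strrchr() finds the last dot rather than the first.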
strspn(), strcspn()
strspn() returns the length of the leading part of a string that consists only of characters from the specified character set; strcspn() is the opposite, and returns the length of the leading part that contains no characters from the specified set.
size_t strspn(const char* str, const char* accept);
size_t strcspn(const char* str, const char* reject);
These two functions take two arguments, the first being the source string and the second being a string consisting of the specified characters.
strspn() starts at the beginning of the first argument and stops at the first character that is not in the specified character set, returning the length of the string scanned so far. If every character is in the set, the length of the first argument string is returned.
strcspn(), on the other hand, stops at the first character that is in the specified character set and returns the length of the string scanned so far. If no character from the set is found, the length of the first argument string is returned.
char str[] = "hello world";
size_t n;
n = strspn(str, "aeiou");
printf("%zu\n", n); // n == 0
n = strcspn(str, "aeiou");
printf("%zu\n", n); // n == 1
In the above example, the first n is 0 because the character h at position 0 does not belong to the character set aeiou; in other words, 0 characters at the beginning of the string belong to the set. The second n is 1 because the character e at position 1 belongs to the character set aeiou; in other words, 1 character at the beginning of the string does not belong to the set.
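Two common uses of these functions can be sketched as follows. The helper names chomp and skip_blanks are made up for this illustration: the first uses strcspn() to cut a line at its newline, the second uses strspn() to step over leading spaces and tabs.

```c
#include <string.h>

/* Cut line at its first '\n', in place. If there is no newline,
   strcspn() returns strlen(line), so the assignment simply rewrites
   the existing terminator. */
void chomp(char *line) {
    line[strcspn(line, "\n")] = '\0';
}

/* Return a pointer to the first character of s that is neither a
   space nor a tab. */
const char *skip_blanks(const char *s) {
    return s + strspn(s, " \t");
}
```

Because both functions return a length rather than a pointer, the result can be used directly as an index or an offset, with no NULL check needed.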
strpbrk()
strpbrk() searches the string for any character in the specified character set.
char* strpbrk(const char* s1, const char* s2);
It accepts two arguments, the first being the source string and the second being a string consisting of the specified characters.
It returns a pointer to the first matching character, or NULL if no matching character is found.
char* s1 = "Hello, world!";
char* s2 = "dow!";
char* p = strpbrk(s1, s2);
printf("%s\n", p); // "o, world!"
In the above example, the specified character set is "dow!", so the first matching character in s1 is the "o" in "Hello", and the pointer p points to this character. The output is "o, world!", running from this character to the end of the string.
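To visit every matching character rather than only the first, strpbrk() can be called again starting one character past the previous match. The sketch below assumes a hypothetical helper named count_any, which is not a library function.

```c
#include <stddef.h>
#include <string.h>

/* Count the characters of s that appear in set. strpbrk() never
   matches the terminator, so p + 1 always stays inside the string. */
size_t count_any(const char *s, const char *set) {
    size_t count = 0;
    for (const char *p = strpbrk(s, set); p != NULL; p = strpbrk(p + 1, set)) {
        count++;
    }
    return count;
}
```

With the strings from the example above, count_any("Hello, world!", "dow!") finds o, w, o, d and !, and so returns 5.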
strstr()
strstr() looks inside a string to find another string.
char *strstr(
const char* str,
const char* substr
);
It accepts two arguments, the first being the source string and the second being the substring to be found.
If the match is successful, it returns a pointer to the substring inside the source string. If the match fails, NULL is returned, indicating that the substring could not be found.
char* str = "The quick brown fox jumped over the lazy dogs.";
char* p = strstr(str, "lazy");
printf("%s\n", p == NULL ? "null": p); // "lazy dogs."
In the above example, strstr() is used to find the substring lazy inside the source string str. Printing from the returned pointer to the end of the string gives lazy dogs.
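The same call can be repeated to find later occurrences. The following sketch counts non-overlapping occurrences of a substring; count_substr is a hypothetical name, and treating an empty substring as zero matches is a choice made here, not something the library prescribes.

```c
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of substr in str by resuming the
   search just past each match. */
size_t count_substr(const char *str, const char *substr) {
    size_t len = strlen(substr);
    size_t count = 0;
    if (len == 0) {
        return 0; /* strstr() matches an empty substring everywhere */
    }
    for (const char *p = strstr(str, substr); p != NULL; p = strstr(p + len, substr)) {
        count++;
    }
    return count;
}
```

Advancing by len rather than by 1 is what makes the matches non-overlapping: count_substr("aaaa", "aa") returns 2, not 3.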
strtok()
strtok() is used to break up a string into a series of tokens, according to the specified delimiter.
char* strtok(char* str, const char* delim);
It accepts two arguments, the first being the string to be broken up and the second being the specified delimiter.
It returns a pointer to the first token and replaces the delimiter that ends the token with the string terminator \0. It returns NULL if there are no more tokens.
If the first argument of strtok() is NULL, the search continues from where the previous strtok() call left off.
To iterate over all tokens, you must call it in a loop, as in the following example.
#include <stdio.h>
#include <string.h>
int main(void) {
    char string[] = "This is a sentence with 7 tokens";
    // The first call passes the string; later calls pass NULL to continue.
    char* tokenPtr = strtok(string, " ");
    while (tokenPtr != NULL) {
        printf("%s\n", tokenPtr);
        tokenPtr = strtok(NULL, " ");
    }
}
The above example splits the source string into tokens at each space. It produces the following output.
This
is
a
sentence
with
7
tokens
Note that strtok() modifies the original string, overwriting the delimiter after each token with the string terminator \0. It is therefore better to make a copy of the original string and then run strtok() on this copy.
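The advice to tokenize a copy can be sketched like this. The function name count_tokens and its -1 error return are assumptions made for the illustration; the point is only that the caller's string is left untouched.

```c
#include <stdlib.h>
#include <string.h>

/* Count the tokens in text without modifying it. Returns -1 if memory
   for the working copy cannot be allocated. */
int count_tokens(const char *text, const char *delim) {
    size_t size = strlen(text) + 1; /* +1 for the terminator */
    char *copy = malloc(size);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, text, size);

    int count = 0;
    for (char *tok = strtok(copy, delim); tok != NULL; tok = strtok(NULL, delim)) {
        count++;
    }
    free(copy);
    return count;
}
```

Since only the copy is written to, this also works on string literals, which must not be modified. Note that strtok() treats a run of adjacent delimiters as a single separator, so "a,,b" split on "," yields two tokens, not three.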
strcoll()
strcoll() is used to compare two strings with localisation settings enabled, and is used in essentially the same way as strcmp().
int strcoll(const char *s1, const char *s2);
See the following example.
setlocale(LC_ALL, "");
// report é > f
printf("%d\n", strcmp("é", "f"));
// report é < f
printf("%d\n", strcoll("é", "f"));
The above example compares é, which carries an accent, with f. strcmp() reports é as greater than f, while strcoll() correctly recognizes that é comes before f and is therefore less than f, provided the current locale sorts é before f. Note that the localization setting needs to be enabled before the comparison can be made, using setlocale(LC_ALL, ""), which is declared in locale.h.
strxfrm()
strxfrm() converts a localized string into a form that can be compared using strcmp(), equivalent to the first part of the operation inside strcoll().
size_t strxfrm(
char * restrict s1,
const char * restrict s2,
size_t n
);
It takes three arguments: it converts the second argument s2 into a form that can be compared using strcmp() and stores the result in the first argument s1. The third argument, n, limits the number of bytes written, to prevent exceeding the bounds of s1.
It returns the length of the converted string, excluding the terminator at the end.
If the first argument is NULL and the third argument is 0, then no actual conversion is performed and only the required length of the converted string is returned.
The following example uses this function to implement a version of strcoll() by hand.
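int my_strcoll(const char* s1, const char* s2) {
    // Ask for the transformed lengths first; +1 leaves room for the terminator.
    size_t len1 = strxfrm(NULL, s1, 0) + 1;
    size_t len2 = strxfrm(NULL, s2, 0) + 1;
    // A robust version would check both malloc() results for NULL.
    char *d1 = malloc(len1);
    char *d2 = malloc(len2);
    strxfrm(d1, s1, len1);
    strxfrm(d2, s2, len2);
    int result = strcmp(d1, d2);
    free(d2);
    free(d1);
    return result;
}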
The above example first allocates storage for the converted form of each of the two strings to be compared, uses strxfrm() to convert them to a comparable form, and then uses strcmp() to compare them.
strerror()
The strerror() function returns the description string for a particular error.
char *strerror(int errornum);
Its argument is the number of the error, as defined by errno.h. The return value is a pointer to the description string.
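// prints the description of error 2, e.g. No such file or directory
printf("%s\n", strerror(2));
The above example prints the description string for error number 2. On Linux, where 2 is ENOENT, this is "No such file or directory"; error numbers and message text can differ between platforms.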
The following example builds a custom error message.
#include <stdio.h>
#include <string.h>
#include <errno.h>
int main(void) {
    FILE* fp = fopen("NONEXISTENT_FILE.TXT", "r");
    if (fp == NULL) {
        char* errmsg = strerror(errno);
        printf("Error %d opening file: %s\n", errno, errmsg);
    }
}
In the above example, the current default error message is obtained via strerror(errno), where errno is a macro defined by errno.h, indicating the current error number. Then, a custom error message is output.
Memory manipulation functions
The following memory manipulation functions are described in detail in the chapter on Memory Management.
- memcpy(): memory copy function.
- memmove(): the memory copy function (overlap allowed).
- memcmp(): compares two memory regions.
memchr()
memchr() is used to find the specified character in a memory region.
```c
void* memchr(const void* s, int c, size_t n);
```
It accepts three arguments, the first being a pointer to the memory region, the second being the character to be found, and the third being the length in bytes of the memory region.
Once the character is found, the search stops and a pointer to that location is returned. If the character is not found within the specified number of bytes, NULL is returned.
Here is an example.
```c
char *str = "Hello, world!";
char *p;
p = memchr(str, '!', 13); // p points to the position of the exclamation mark
```
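Because memchr() is given an explicit length, it keeps searching past zero bytes, which strchr() cannot do. The sketch below wraps it in a hypothetical helper, find_byte, that reports an offset instead of a pointer.

```c
#include <stddef.h>
#include <string.h>

/* Return the offset of the first byte equal to value in buf[0..n-1],
   or -1 if there is none. */
ptrdiff_t find_byte(const void *buf, size_t n, unsigned char value) {
    const unsigned char *start = buf;
    const unsigned char *hit = memchr(buf, value, n);
    return hit ? hit - start : -1;
}
```

For the five bytes 'a', 0, 'b', 0, 'c', find_byte() locates 'c' at offset 4, whereas strchr() would stop at the zero byte at offset 1 and report no match.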
### memset()
memset() sets every byte of a block of memory to the specified value.
```c
void* memset(void* s, int c, size_t n);
```
Its first argument is a pointer to the start of the memory area, its second argument is the value to be written (converted to unsigned char), and its third argument is the number of bytes to be set. It returns the first argument (the pointer).
memset(p, ' ', N);
In the above example, p is a pointer to a region of memory N bytes long. memset() overwrites each byte of that region with a space character.
Here is another example.
char string1[15] = "BBBBBBBBBBBBBB";
// output bbbbbbbBBBBBBB
printf("%s\n", (char*) memset(string1, 'b', 7));
An important use of memset() is to initialize all array members to 0.
memset(arr, 0, sizeof(arr));
The following is an example of setting every member of a struct to 0.
struct banana {
    float ripeness;
    char *peel_color;
    int grams;
};
struct banana b;
memset(&b, 0, sizeof b);
b.ripeness == 0.0; // True
b.peel_color == NULL; // True
b.grams == 0; // True
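The above example sets every member of the struct banana instance b to 0. Strictly speaking, memset() zeroes the bytes; that an all-zero byte pattern means 0.0 for a float and NULL for a pointer holds on mainstream platforms but is not guaranteed by the C standard.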
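One caveat follows from the fact that memset() writes bytes, not elements: it is suitable for zeroing an int array, but not for filling it with another value. The sketch below contrasts the two cases; zero_ints and fill_ints are hypothetical helper names.

```c
#include <stddef.h>
#include <string.h>

/* Zero an int array. This works with memset() because an all-zero
   byte pattern represents the integer 0. */
void zero_ints(int *arr, size_t n) {
    memset(arr, 0, n * sizeof *arr);
}

/* Set every element to value. memset() cannot do this in general: it
   would copy the low byte of value into every byte of every element. */
void fill_ints(int *arr, size_t n, int value) {
    for (size_t i = 0; i < n; i++) {
        arr[i] = value;
    }
}
```

For example, with a 4-byte int, memset(arr, 1, sizeof arr) makes each element 0x01010101 (16843009) rather than 1.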
Other functions
void* memset(void* a, int c, size_t n);
size_t strlen(const char* s);